-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Go site 2210 gorule 0000027 must check dbs are in the db xref file #677
Go site 2210 gorule 0000027 must check dbs are in the db xref file #677
Conversation
Linking to geneontology/go-site#2210 |
ontobio/io/gpadparser.py
Outdated
@@ -235,6 +239,8 @@ def parse_line(self, line): | |||
references = self.validate_curie_ids(assoc.evidence.has_supporting_reference, split_line) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mugitty Is there any overlap of logic between self.validate_curie_ids
here and self._validate_curie_using_db_xrefs
just below? Could self._validate_curie_using_db_xrefs
be incorporated into self.validate_curie_ids
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dustine32 , validate_curie_ids calls _validate_id . This checks for things like "DB:id" or for annotations, prefix has to be in GO id space. However, it does not validate against the syntax pattern specified in the db-xrefs file. This is meant as a catch-all for any identifier in the GAF line that has a database field and id_syntax
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mugitty Would there be any issue if you incorporated this db-xrefs syntax pattern checking inside _validate_id
? It looks like _validate_id
assumes every id
is a CURIE, which by definition should always have colon-separated database field and id_syntax (else an error will be reported). The only complication I could think of is if a metadata/db-xrefs
is not supplied when _validate_id
is called but you could just make this optional in _validate_id
(i.e., if self.config.db_type_name_regex_id_syntax is not None
).
I guess my main concern is separating validation logic throughout the code that really should be in the same place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dustine32, as you suggested, I could incorporate in _validate_id and add a check for self.config.db_type_name_regex_id_syntax is not None.
Let me update
@mugitty Do we already have a test for gorule 0000027 somewhere? |
@dustine32 , I pushed updates to combine id syntax change into _validate_id method |
@mugitty Awesome, thank you so much! Do we have a test for wrong IDs that should return that "does not match any id_syntax patterns" warning message? It would be good to confirm your new code can be triggered by this. |
I ran it through https://github.com/geneontology/go-site/blob/master/docs/gorules_test_errors.gaf . The output error.json file contains things such as This update outputs warnings. Groups can update the id_syntax and or ids based on these errors |
@mugitty Thanks again! It's great that you confirmed the |
@dustine32, since these are validation tests, I added tests to both test_gafparser.py and test_gpad_parser.py |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mugitty Thanks so much for your patience and thanks for adding the tests! This all looks good now.
No description provided.